Suffix Arrays for Structural Strings

Authors

  • Richard Beal
  • Don Adjeroh
Abstract

The structural match (s-match), originally addressed by the structural suffix tree, helps identify different RNA sequences with the same secondary structure. In this work, we introduce and construct the structural suffix array and structural longest common prefix array, i.e. lightweight suffix data structures for the s-match. Further, we illustrate how to use our data structures to support additional RNA pattern matching problems beyond the s-match.

Related resources

Counting Suffix Arrays and Strings

Suffix arrays are used in various application and research areas like data compression or computational biology. In this work, our goal is to characterize the combinatorial properties of suffix arrays and their enumeration. For fixed alphabet size and string length we count the number of strings sharing the same suffix array and the number of such suffix arrays. Our methods have applications to...

Lightweight Parameterized Suffix Array Construction

We present a first algorithm for direct construction of parameterized suffix arrays and parameterized longest common prefix arrays for non-binary strings. Experimental results show that our algorithm is much faster than naïve methods.

Faster Suffix Tree Construction with Missing

We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new backpropagation component to McCreight’s algorithm and also give a high probability hashing scheme for large degre...

Computing Longest Common Substrings Via Suffix Arrays

Given a set of N strings A = {α1, ..., αN} of total length n over alphabet Σ, one may ask to find, for a fixed integer K, 2 ≤ K ≤ N, the longest substring β that appears in at least K strings in A. It is known that this problem can be solved in O(n) time with the help of suffix trees. However, the resulting algorithm is rather complicated. Also, its running time and memory consumption may de...

String Inference from the LCP Array

The suffix array, perhaps the most important data structure in modern string processing, is often augmented with the longest common prefix (LCP) array which stores the lengths of the LCPs for lexicographically adjacent suffixes of a string. Together the two arrays are roughly equivalent to the suffix tree with the LCP array representing the tree shape. In order to better understand the combinat...
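As a small illustration of the two arrays described above, the following is a minimal sketch, assuming a plain (non-structural, non-parameterized) string and a naive sort-based construction. It is not the algorithm of any paper listed here; the function names `suffix_array` and `lcp_array` are hypothetical and chosen only for this example.

```python
def suffix_array(s: str) -> list[int]:
    """Return the start positions of all suffixes of s in lexicographic order.

    Naive construction for illustration only: it sorts by the suffix text,
    which is far slower than the linear-time methods used in practice.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s: str, sa: list[int]) -> list[int]:
    """Return lcp, where lcp[k] is the length of the longest common prefix
    of the suffixes starting at sa[k - 1] and sa[k]; lcp[0] is set to 0."""
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b = sa[k - 1], sa[k]
        n = 0
        # Extend the match character by character until a mismatch or end of string.
        while a + n < len(s) and b + n < len(s) and s[a + n] == s[b + n]:
            n += 1
        lcp[k] = n
    return lcp


text = "banana"
sa = suffix_array(text)
lcp = lcp_array(text, sa)
print(sa)   # [5, 3, 1, 0, 4, 2]
print(lcp)  # [0, 1, 3, 0, 0, 2]
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so adjacent pairs share prefixes of length 1, 3, 0, 0 and 2.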

Publication date: 2014